-
Prompting LLMs for complex tasks (e.g., building a trip advisor chatbot) requires humans to clearly articulate customized requirements (e.g., “start the response with a tl;dr”). However, existing prompt engineering instructions often lack focused training on requirement articulation and instead tend to emphasize increasingly automatable strategies (e.g., tricks like adding role-plays and “think step-by-step”). To address this gap, we introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements during prompting. We implement ROPE through an assessment and training suite that provides deliberate practice with LLM-generated feedback. In a randomized controlled experiment with 30 novices, ROPE significantly outperforms conventional prompt engineering training (20% vs. 1% gains), a gap that automatic prompt optimization cannot close. Furthermore, we demonstrate a direct correlation between the quality of input requirements and the quality of LLM outputs. Our work paves the way to empower more end-users to build complex LLM applications.
Free, publicly-accessible full text available April 24, 2026.
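To make the contrast concrete, the sketch below is an illustration only, not the authors' ROPE suite: the trip-advisor scenario, the requirement wording, and the naive surface check are invented assumptions. It juxtaposes a strategy-heavy prompt with a requirement-oriented one and shows the kind of checkable requirement list the paradigm emphasizes.

```python
# Hypothetical sketch (not the ROPE implementation): strategy-oriented vs.
# requirement-oriented prompting for an invented trip-advisor chatbot.

# Conventional prompt engineering often stacks generic strategies:
strategy_prompt = (
    "You are an expert travel agent. Think step-by-step. "
    "Plan a trip for the user."
)

# Requirement-oriented prompting spells out customized, checkable requirements.
# The requirements below are invented examples.
requirements = [
    "Start the response with a 'tl;dr' of at most two sentences.",
    "Suggest exactly three destinations, each with a daily budget estimate.",
    "If travel dates are missing, ask one clarifying question before planning.",
]

requirement_prompt = (
    "You are a trip advisor chatbot. Every response must satisfy all of the "
    "following requirements:\n"
    + "\n".join(f"{i}. {req}" for i, req in enumerate(requirements, start=1))
)

def unmet_requirements(response: str) -> list[str]:
    """Naive surface check for the first requirement only; the paper's suite
    relies on LLM-generated feedback, which this placeholder does not attempt."""
    unmet = []
    if not response.lower().lstrip().startswith("tl;dr"):
        unmet.append(requirements[0])
    return unmet

if __name__ == "__main__":
    print(requirement_prompt)
    print(unmet_requirements("Here are three destinations for your trip..."))
```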
-
Large Language Models (LLMs) now excel at generative skills and can create content at remarkable speed. However, they are imperfect and still make various mistakes. In a Computer Science education context, as these models are widely recognized as “AI pair programmers,” it becomes increasingly important to train students to evaluate and debug LLM-generated code. In this work, we introduce HypoCompass, a novel system that facilitates deliberate practice on debugging, where human novices play the role of Teaching Assistants and help LLM-powered teachable agents debug code. We enable effective task delegation between students and LLMs in this learning-by-teaching environment: students focus on hypothesizing the cause of code errors, while adjacent skills like code completion are offloaded to LLM agents. Our evaluations demonstrate that HypoCompass generates high-quality training materials (e.g., bugs and fixes), outperforming human counterparts fourfold in efficiency, and significantly improves student performance on debugging, with a 12% gain from pre- to post-test.
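As a rough illustration of the task delegation described above, and not the HypoCompass system itself, the following sketch uses an invented bug, an invented student hypothesis, and a stubbed stand-in for the LLM agent that drafts a fix.

```python
# Hypothetical sketch of the student/LLM task split: the human novice
# hypothesizes the cause of the bug; drafting a fix is offloaded to an
# LLM agent, stubbed out here as a canned patch.

def buggy_average(xs: list[float]) -> float:
    """Code in the style an LLM-powered agent might produce: fails on an empty list."""
    return sum(xs) / len(xs)

# Step 1 (human novice): state a hypothesis about the cause of the failure.
student_hypothesis = (
    "len(xs) is 0 when the list is empty, so the division raises ZeroDivisionError."
)

# Step 2 (LLM agent, stubbed): propose a fix consistent with the hypothesis.
def propose_fix(hypothesis: str) -> str:
    """Stand-in for an LLM call; returns a canned patch for illustration."""
    return (
        "def fixed_average(xs):\n"
        "    return sum(xs) / len(xs) if xs else 0.0\n"
    )

if __name__ == "__main__":
    try:
        buggy_average([])
    except ZeroDivisionError:
        print("Bug reproduced: empty input crashes the function.")
    print("Student hypothesis:", student_hypothesis)
    print("LLM-proposed fix:\n" + propose_fix(student_hypothesis))
```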
-
The relative effectiveness of reflection through student generation of contrasting cases versus through provided contrasting cases is not well-established for adult learners. This paper presents a classroom study that investigates this comparison in a college-level Computer Science (CS) course where groups of students worked collaboratively to design database access strategies. Forty-four teams were randomly assigned to three reflection conditions: [GEN] a directive to generate a contrasting case to the student solution and evaluate the trade-offs in light of the principle, [CONT] a directive to compare the student solution with a provided contrasting case and evaluate the trade-offs in light of a principle, and [NSI] a control condition with a non-specific directive for reflection, evaluating the student solution in light of a principle. In the CONT condition, an LLM generated the contrasting case to a group's solution, illustrating how LLMs can support knowledge transformation (beyond knowledge construction) when contributing automatically to a collaborative learning discussion: the generated case applied an alternative problem-solving strategy while keeping many concrete details the same as those the group had most recently constructed together, which highlighted the contrast. There was no main effect of condition on learning as measured by a content test. However, low-pretest students learned more from CONT than from GEN, with NSI not distinguishable from the other two, while high-pretest students learned marginally more from GEN than from CONT, with NSI again not distinguishable from the other two.

Practitioner notes

What is already known about this topic
- Reflection during, or even in place of, computer programming benefits the learning of principles in advanced computer science when the principles are new to students.
- Both generating contrasting cases and comparing provided contrasting cases have been shown to be effective opportunities to learn from reflection in some contexts, though questions remain about the ideal applicability conditions for adult learners.
- Intelligent conversational agents can effectively deliver stimuli for reflection during collaborative learning, though room for improvement remains, which provides an opportunity to demonstrate the potential contribution of large language models (LLMs).

What this paper adds
- The study contributes new knowledge about how applicability conditions differ between generating contrasting cases and comparing provided contrasting cases for adult learning.
- The paper presents an application of LLMs as a tool for providing contrasting cases tailored to the details of actual student solutions.
- The study provides classroom intervention evidence of a positive impact on student learning from an LLM-enabled intervention.

Implications for practice and/or policy
- Advanced computer science curricula should make substantial room for reflection alongside problem solving.
- Instructors should provide reflection opportunities tailored to students' levels of prior knowledge.
- Instructors would benefit from training in using LLMs as tools for providing effective contrasting cases, especially for low-prior-knowledge students.
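For readers curious how such a contrasting case might be requested from an LLM, the sketch below is an invented prompt template rather than the study's actual materials; the function name, strategy descriptions, and example solution are placeholders.

```python
# Illustrative sketch only: ask an LLM to rewrite a group's solution using an
# alternative database access strategy while keeping concrete details fixed,
# so the contrast between strategies stands out.

CONTRASTING_CASE_TEMPLATE = """\
A student team designed the following database access strategy:

{student_solution}

Rewrite this solution so that it applies the alternative strategy
"{alternative_strategy}" instead, while keeping the table names, fields,
and application scenario exactly the same. Then list the trade-offs
between the two solutions in light of the principle: {principle}.
"""

def build_contrasting_case_prompt(student_solution: str,
                                  alternative_strategy: str,
                                  principle: str) -> str:
    """Fill the template with the group's most recent collaborative solution."""
    return CONTRASTING_CASE_TEMPLATE.format(
        student_solution=student_solution.strip(),
        alternative_strategy=alternative_strategy,
        principle=principle,
    )

if __name__ == "__main__":
    print(build_contrasting_case_prompt(
        student_solution="Fetch all rows once and filter in application code.",
        alternative_strategy="push filtering into a parameterized SQL WHERE clause",
        principle="minimize data transferred between database and application",
    ))
```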
